Blood Gas and Pressure Thresholds

Basis for Thresholding

Reducing each patient's timeseries of blood gas and pressure measurements to scalar indicators, defined by thresholds on the measured values and chosen to correlate best with outcomes, is helpful for several reasons. First, single values for the gas and pressure levels are easier to model alongside existing covariates like age, gender, and Marshall/GCS scores. Second, models built on these scalar values are needed to establish confidence intervals for effects (Bayesian credible intervals are possible without reduction to scalar values, but they may be difficult to share). Last, thresholds make the results easier to explain and visualize.

Example of how a threshold can reduce timeseries measurements to a single value:

Given timeseries measurements \(x_1, x_2, x_3, \ldots, x_n\), compute the fraction of time spent below a threshold value \(x_t\) as \(\frac{N_1}{N_1 + N_2}\), where \(N_1 = |\{x_i : x_i \lt x_t\}|\) and \(N_2 = |\{x_i : x_i \geq x_t\}|\). The fraction of time spent at or above the threshold is the complement, \(\frac{N_2}{N_1 + N_2}\).
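The reduction above can be sketched in a few lines of Python (used here only for illustration). The function name and the ICP readings are hypothetical, and treating a count of measurements as a share of time assumes the measurements are evenly spaced.

```python
def fraction_below(values, threshold):
    """Fraction of measurements strictly below `threshold`, i.e. N1 / (N1 + N2)."""
    if not values:
        raise ValueError("timeseries must contain at least one measurement")
    n_below = sum(1 for x in values if x < threshold)  # N1
    n_at_or_above = len(values) - n_below              # N2
    return n_below / (n_below + n_at_or_above)


# Hypothetical ICP readings for one patient (made-up numbers).
icp = [12.0, 18.5, 22.0, 25.0, 19.0]
below = fraction_below(icp, 20.0)  # share of readings under the threshold
above = 1.0 - below                # share of readings at or above it
```

A reading exactly equal to the threshold counts toward the "at or above" group, matching the definition of \(N_2\).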

Threshold Finding Model (Same as before)

A logistic regression (fit as a Stan sampling model) passes each patient's timeseries measurements through a special function \(f\), whose estimated parameters map to gas/pressure thresholds:

\[ \operatorname{logit}(\Pr(y_i = 1)) = \alpha + \beta \cdot X_i + f(G_i) \]

where

\[ X_i = [\text{Gender}_i, \text{Age}_i, \text{ComaScore}_i, \text{MarshallScore}_i], \] \[ y_i = \begin{cases} 0 & \text{if } \text{GOS}_i \in \{1, 2, 3\} \\ 1 & \text{if } \text{GOS}_i \in \{4, 5\} \end{cases} \]

and

\[ f(G_i) = \frac{1}{n_i} \sum_{j=1}^{n_i} \left[ \frac{c_1}{1 + e^{-c_2(G_{ij} - c_3)}} + \frac{c_4}{1 + e^{-c_5(G_{ij} - c_6)}} \right], \] \[ n_i = \text{length of the timeseries for patient } i \]

Here \(G_{ij}\) is the \(j\)-th gas/pressure measurement for patient \(i\).
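To make the role of the six parameters concrete, here is a minimal Python sketch of \(f\) (Python is used only for illustration; the model itself was fit in Stan). The parameter values in the example are hypothetical and are not the fitted ones. With steep slopes, each sigmoid behaves like a step at its midpoint, so \(f\) approaches a weighted fraction of time spent beyond \(c_3\) and \(c_6\).

```python
import math


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def f(series, c1, c2, c3, c4, c5, c6):
    """Mean over the timeseries of the sum of two scaled sigmoids."""
    total = 0.0
    for g in series:
        total += c1 * sigmoid(c2 * (g - c3)) + c4 * sigmoid(c5 * (g - c6))
    return total / len(series)


# Hypothetical parameters: a steep falling step at 20 and a steep rising step at 70.
# Two of the four made-up readings are below 20 and one is above 70.
value = f([10.0, 30.0, 80.0, 15.0], c1=1.0, c2=-50.0, c3=20.0, c4=1.0, c5=50.0, c6=70.0)
```

In this steep-slope limit the result is close to 3/4, the fraction of readings outside the interval between the two midpoints, which is why the midpoints can be read as thresholds.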

Thresholds w/ No Covariates

The following models were fit using only one gas/pressure measurement:


Thresholds w/ Covariates (ICP & PbtO2)

The following models were fit using one gas/pressure measurement and all covariates (age, sex, GCS, Marshall):

Thresholds w/ Covariates (PaO2 & PaCO2)

The following models were fit using one gas/pressure measurement and all covariates (age, sex, GCS, Marshall):

Threshold Summary

Based on the inferred thresholds and their proximity to some known limits, the following values were used for each quantity as the low, high, and safe ranges:

Quantity  Low.Range  Safe.Range  High.Range
PaCO2     [0, 28)    [28, 42]    42+
PaO2      [0, 300)   [300, 875)  875+
PbtO2     [0, 20)    [20, 70)    70+
ICP       NA         (-Inf, 20)  20+
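As a sketch of how these ranges could be turned into the per-patient covariates used in the models below, the following Python snippet (illustrative only) computes the fraction of measurements falling in each low or high range. Reading each covariate as a fraction of measurements in its range is an assumption, as are the dictionary layout, the function names, and the example readings; the upper boundary handling at 42 for PaCO2 is also assumed to be half-open.

```python
import math

# Covariate name -> (quantity, lower bound, upper bound), copied from the table above.
# Ranges are treated as half-open [low, high); this boundary choice is an assumption.
RANGES = {
    "paco2_0_28": ("paco2", 0.0, 28.0),
    "paco2_42_inf": ("paco2", 42.0, math.inf),
    "pao2_0_300": ("pao2", 0.0, 300.0),
    "pao2_875_inf": ("pao2", 875.0, math.inf),
    "pbto2_0_20": ("pbto2", 0.0, 20.0),
    "pbto2_70_inf": ("pbto2", 70.0, math.inf),
    "icp1_20_inf": ("icp", 20.0, math.inf),
}


def range_burden(values, low, high):
    """Fraction of measurements x with low <= x < high."""
    if not values:
        raise ValueError("timeseries must contain at least one measurement")
    return sum(1 for x in values if low <= x < high) / len(values)


def burden_features(series_by_quantity):
    """Compute every covariate whose underlying quantity is present in the input."""
    features = {}
    for name, (quantity, low, high) in RANGES.items():
        if quantity in series_by_quantity:
            features[name] = range_burden(series_by_quantity[quantity], low, high)
    return features


# Made-up readings for one hypothetical patient.
example = burden_features({"pbto2": [15.0, 25.0, 75.0, 18.0], "icp": [12.0, 22.0]})
```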




Gas and Pressure Model Statistical Performance

Model Selection via Information Criteria

Exhaustive model selection via AICc with model space given by:

gos ~ age + sex + marshall + gcs + icp1_20_inf + pao2_0_300 + pao2_875_inf + pbto2_0_20 + pbto2_70_inf + paco2_0_28 + paco2_42_inf

Note that only main effects in a linear model were considered, because models with nonlinearities and interactions showed no major improvement. The GOS outcome was also binarized into two classes: bad outcomes (original GOS in [1,2,3]) and good outcomes (original GOS in [4,5]).

Modeling call:

glmulti(f.glmulti, data=d.glmulti, family='binomial', level=1, crit = AICc)
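For readers unfamiliar with the criterion, the Python sketch below (illustrative only, not the glmulti implementation) shows the standard AICc formula and what an exhaustive main-effects search enumerates. The function names are hypothetical, and `k` is assumed to count every estimated parameter including the intercept.

```python
from itertools import combinations

# The eleven candidate main effects from the model space above.
PREDICTORS = [
    "age", "sex", "marshall", "gcs", "icp1_20_inf",
    "pao2_0_300", "pao2_875_inf", "pbto2_0_20", "pbto2_70_inf",
    "paco2_0_28", "paco2_42_inf",
]


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC: -2 logL + 2k + 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def candidate_models(predictors):
    """Yield every subset of the predictors, including the empty (intercept-only) set."""
    for size in range(len(predictors) + 1):
        for subset in combinations(predictors, size):
            yield subset


# Every subset of 11 main effects: 2**11 candidate formulas.
n_models = sum(1 for _ in candidate_models(PREDICTORS))
```

The correction term grows with the number of parameters relative to the sample size, so with 172 observations it penalizes the larger models slightly more than plain AIC would.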

IC-Based Model Selection Results

Best model found after exhaustive search (note that PaO2 is absent):

[1] “Number of observations = 172”

              Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)    -1.3441      0.2173  -6.1865  0 ***
age            -0.6879      0.221   -3.113   0 ***
gcs             0.4343      0.2089   2.0792  0.04 *
icp1_20_inf    -0.6278      0.3717  -1.6892  0.09 .
paco2_42_inf    0.4319      0.1796   2.4044  0.02 *
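The columns of a table like this are related in a simple way, sketched here in Python (illustrative only): the z value is the estimate divided by its standard error, the p-value is the two-sided tail of a standard normal, and exponentiating the estimate gives an odds ratio per one-unit increase in the covariate as it was scaled in the model. The function name is hypothetical, and small differences from the printed values come from rounding in the table.

```python
import math


def wald_summary(estimate, std_error):
    """Wald z statistic, two-sided normal p-value, and odds ratio for one coefficient."""
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return {"z": z, "p": p, "odds_ratio": math.exp(estimate)}


# paco2_42_inf row from the table above.
paco2_high = wald_summary(0.4319, 0.1796)
```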

Impact of PbtO2

While a model that includes all predictors (ICP, PaO2, PaCO2, and PbtO2) shows only weak significance for each, simpler models with smaller groups of predictors show stronger relationships (though still fairly weak overall).

For example, this model more directly answers the question “What is the impact of hypoxic PbtO2 burden while controlling for known predictors of outcome?”:

Formula = gos ~ age + sex + marshall + gcs + icp1_20_inf + pbto2_0_20 + pbto2_70_inf
N = 172

              Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)    -1.365       0.2227  -6.1294  0 ***
age            -0.5839      0.2299  -2.5402  0.01 **
sex             0.2059      0.2027   1.0157  0.31
marshall       -0.3332      0.2378  -1.4014  0.16
gcs             0.2894      0.2171   1.3327  0.18
icp1_20_inf    -0.6222      0.3831  -1.624   0.1 .
pbto2_0_20     -0.3321      0.2318  -1.4328  0.15
pbto2_70_inf   -0.0864      0.191   -0.4522  0.65

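Even with ICP in the model, the estimated effect of hypoxic PbtO2 burden (pbto2_0_20) stays negative, although in the table above it does not reach significance at the 95% level (p = 0.15), and neither PbtO2 term appears in the best model shown earlier.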

Impact of PbtO2 w/ PaO2

To answer the question “Is PbtO2 better than PaO2?”, a model like that above that also includes PaO2 could be considered:

Formula = gos ~ age + sex + marshall + gcs + icp1_20_inf + pao2_0_300 + pao2_875_inf + pbto2_0_20 + pbto2_70_inf
N = 172

              Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)    -1.3974      0.2284  -6.118   0 ***
age            -0.6234      0.2374  -2.6261  0.01 **
sex             0.2244      0.2051   1.0945  0.27
marshall       -0.319       0.2405  -1.326   0.18
gcs             0.2906      0.2192   1.3261  0.18
icp1_20_inf    -0.643       0.3906  -1.6462  0.1 .
pao2_0_300     -0.0431      0.2058  -0.2093  0.83
pao2_875_inf   -0.3381      0.2944  -1.1482  0.25
pbto2_0_20     -0.3457      0.2358  -1.466   0.14
pbto2_70_inf   -0.085       0.1889  -0.4501  0.65

At least on the scale of statistical significance, an argument could be made that PbtO2 is better than PaO2.

Statistical Performance Conclusions

Information criteria scores and the statistical significance of different predictors show fairly weak evidence that PbtO2 is both important and a better predictor than PaO2, even after controlling for known predictors of outcome like ICP burden, Age, Gender, GCS, and Marshall scores.

Despite the above, I would argue that information criteria and statistical significance are probably not the best basis for these conclusions (especially given that the evidence is not overwhelming), and that a better approach may be to assess model performance in leave-one-out cross validation. A predictive accuracy measure like that would likely give a better sense of the quality of the different predictors on a more practical scale. To that end, most of what follows examines predictive performance instead.





Gas and Pressure Model Predictive Performance

Predictive Performance

All models below were tested in leave-one-out cross validation in order to compare predictive performance measures, such as ROC-AUC, between them:

model.name                 variables
icp                        icp1_20_inf
paco2                      paco2_0_28, paco2_42_inf
pao2                       pao2_0_300, pao2_875_inf
pbto2                      pbto2_0_20, pbto2_70_inf
pao2_pbto2                 pao2_0_300, pao2_875_inf, pbto2_0_20, pbto2_70_inf
icp_paco2                  icp1_20_inf, paco2_0_28, paco2_42_inf
icp_pao2                   icp1_20_inf, pao2_0_300, pao2_875_inf
icp_pbto2                  icp1_20_inf, pbto2_0_20, pbto2_70_inf
icp_pao2_pbto2             icp1_20_inf, pao2_0_300, pao2_875_inf, pbto2_0_20, pbto2_70_inf
icp_pao2_pbto2_paco2       icp1_20_inf, pao2_0_300, pao2_875_inf, pbto2_0_20, pbto2_70_inf, paco2_0_28, paco2_42_inf
demo                       age, sex
wcov_icp                   age, sex, marshall, gcs, icp1_20_inf
wcov_paco2                 age, sex, marshall, gcs, paco2_0_28, paco2_42_inf
wcov_pao2                  age, sex, marshall, gcs, pao2_0_300, pao2_875_inf
wcov_pbto2                 age, sex, marshall, gcs, pbto2_0_20, pbto2_70_inf
wcov_pao2_pbto2            age, sex, marshall, gcs, pao2_0_300, pao2_875_inf, pbto2_0_20, pbto2_70_inf
wcov_icp_paco2             age, sex, marshall, gcs, icp1_20_inf, paco2_0_28, paco2_42_inf
wcov_icp_pao2              age, sex, marshall, gcs, icp1_20_inf, pao2_0_300, pao2_875_inf
wcov_icp_pbto2             age, sex, marshall, gcs, icp1_20_inf, pbto2_0_20, pbto2_70_inf
wcov_icp_pao2_pbto2        age, sex, marshall, gcs, icp1_20_inf, pao2_0_300, pao2_875_inf, pbto2_0_20, pbto2_70_inf
wcov_icp_pao2_pbto2_paco2  age, sex, marshall, gcs, icp1_20_inf, pao2_0_300, pao2_875_inf, pbto2_0_20, pbto2_70_inf, paco2_0_28, paco2_42_inf
wcov_none                  age, sex, marshall, gcs
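The evaluation loop applied to each of these models can be sketched as follows, in Python for illustration only. Each patient is held out once, the model is refit on the rest, and the held-out scores are pooled into a single AUC. The class-mean scorer is a deliberately simple stand-in for the logistic regression used in the analysis, and all function names and data are hypothetical.

```python
def auc(scores, labels):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def loo_scores(xs, ys, fit, predict):
    """Leave-one-out: refit without observation i, then score observation i."""
    scores = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        scores.append(predict(model, xs[i]))
    return scores


def fit_class_means(train_x, train_y):
    """Stand-in model (not logistic regression): store the mean of each class."""
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    neg = [x for x, y in zip(train_x, train_y) if y == 0]
    return (sum(neg) / len(neg), sum(pos) / len(pos))


def predict_class_means(model, x):
    """Higher score means x sits closer to the positive-class mean."""
    mean_neg, mean_pos = model
    return abs(x - mean_neg) - abs(x - mean_pos)


# Made-up, perfectly separable single-covariate data.
xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 1, 1, 1]
toy_auc = auc(loo_scores(xs, ys, fit_class_means, predict_class_means), ys)
```

Because every score comes from a model that never saw the scored patient, the resulting AUC reflects out-of-sample ranking ability rather than in-sample fit.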

GLM AUC Results

    model                      auc        tp  fp  tn   fn  n
17  wcov_icp_paco2             0.6572922  4   8   121  39  172
12  wcov_icp                   0.6435911  4   5   124  39  172
13  wcov_paco2                 0.6435911  5   7   122  38  172
19  wcov_icp_pbto2             0.6426897  6   6   123  37  172
15  wcov_pbto2                 0.6408870  4   4   125  39  172
21  wcov_icp_pao2_pbto2_paco2  0.6381828  8   12  117  35  172
22  wcov_none                  0.6342167  1   1   128  42  172
18  wcov_icp_pao2              0.6304309  4   5   124  39  172
20  wcov_icp_pao2_pbto2        0.6300703  7   7   122  36  172
16  wcov_pao2_pbto2            0.6289886  3   6   123  40  172
14  wcov_pao2                  0.6239409  3   3   126  40  172
11  demo                       0.5819362  0   0   129  43  172
2   paco2                      0.5630070  2   4   125  41  172
6   icp_paco2                  0.5557959  2   4   125  41  172
10  icp_pao2_pbto2_paco2       0.5289346  3   5   124  40  172
8   icp_pbto2                  0.5164954  0   0   129  43  172
4   pbto2                      0.5089237  0   0   129  43  172
9   icp_pao2_pbto2             0.4961240  0   0   129  43  172
5   pao2_pbto2                 0.4867496  0   0   129  43  172
7   icp_pao2                   0.4532180  0   0   129  43  172
3   pao2                       0.4149991  0   0   129  43  172
1   icp                        0.3861547  0   0   129  43  172
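The confusion counts in this table can be converted into rates to see what the classifiers do at their decision cutoff. The Python sketch below is illustrative only; the cutoff behind the tp/fp/tn/fn columns is not stated in the source and is presumably a fixed probability threshold, and the function name is hypothetical.

```python
def rates(tp, fp, tn, fn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Rows copied from the table above.
top = rates(tp=4, fp=8, tn=121, fn=39)   # wcov_icp_paco2, highest AUC
demo = rates(tp=0, fp=0, tn=129, fn=43)  # demo, which never predicts the positive class
```

For the highest-AUC model this gives a sensitivity of 4/43 (about 9%) and a specificity of 121/129 (about 94%), so at this cutoff nearly all of the 43 positive cases are missed even though the ranking quality (AUC) is the best in the table.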

AUC Comparisons

Moving beyond GLMs to more sophisticated black-box models, AUC values do not change drastically, indicating that little is lost by ignoring interactions and nonlinearities:

AUC by Model Type
model                      GLM        KNN        RFT        GBM
wcov_icp_pao2_pbto2        0.6300703  0.6021273  0.6001442  0.6183523
wcov_none                  0.6342167  0.6453038  0.6208761  0.6352984
wcov_icp_pao2_pbto2_paco2  0.6381828  0.6372814  0.5949162  0.6486389
wcov_icp_pbto2             0.6426897  0.6078962  0.6110510  0.6414278
wcov_icp                   0.6435911  0.6474671  0.6163692  0.6441320

GLM ROC Plots

ROC Curves from logistic regression models:

GBM ROC Plots

ROC Curves from GBM models:

KNN ROC Plots

ROC Curves from nearest neighbor models:

Predictive Performance Conclusions

Changes in ROC curves and AUC values further indicate that PaO2 is a weaker predictor than PbtO2. However, the effect of all gases on a practical scale is still relatively weak, and even the highest performing classifiers offer a fairly minor lift over those built using only known predictors of outcome like GCS + Marshall scores and Age.





Sub Group Predictability

Looking for Interactions

One way to answer the question “Do patients with lower PaO2 show a greater association between PbtO2 and outcome?” would be to look for potential interactions between PaO2 and PbtO2.

One such model would be:

gos ~ age + marshall + gcs + sex + pao2_0_300 + pbto2_0_20 + icp1_20_inf + pao2_0_300:pbto2_0_20

                       Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)             -1.3486      0.2244  -6.0104  0 ***
age                     -0.5686      0.2308  -2.463   0.01 **
marshall                -0.3433      0.2372  -1.4476  0.15
gcs                      0.2894      0.218    1.3274  0.18
sex                      0.2024      0.2035   0.9946  0.32
pao2_0_300              -0.0154      0.2105  -0.0732  0.94
pbto2_0_20              -0.3143      0.2314  -1.358   0.17
icp1_20_inf             -0.6084      0.3823  -1.5916  0.11
pao2_0_300:pbto2_0_20   -0.0891      0.2546  -0.35    0.73
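With an interaction term, the effect of pbto2_0_20 on the log-odds is no longer a single number: it equals its main-effect coefficient plus the interaction coefficient times pao2_0_300. The Python sketch below (illustrative only) uses the point estimates from the table above; the function name is hypothetical, the scale of the covariates is not stated in the source, and both coefficients carry large standard errors.

```python
# Point estimates copied from the interaction model above.
B_PBTO2 = -0.3143        # pbto2_0_20 main effect
B_INTERACTION = -0.0891  # pao2_0_300:pbto2_0_20


def pbto2_slope(pao2_0_300):
    """Change in log-odds per unit of pbto2_0_20 at a given value of pao2_0_300."""
    return B_PBTO2 + B_INTERACTION * pao2_0_300


slope_at_zero = pbto2_slope(0.0)  # the main effect alone
slope_at_one = pbto2_slope(1.0)   # one unit more low-PaO2 burden
```

Moving pao2_0_300 from 0 to 1 shifts the PbtO2 slope from -0.3143 to about -0.4034, which is the direction the interaction question asks about, though the interaction estimate itself is far from significant (p = 0.73).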

Interaction Effects

How the effect of PbtO2 on predicted outcome probabilities changes with PaO2 levels:

More PbtO2 Interactions

Looking for similar interactions between PbtO2 and other predictors shows nothing new. For example, an exhaustive AIC search over models that include the following predictors interacted with PbtO2 shows that only the PaO2 interaction is even remotely important:

Variables considered in search:

##  [1] "age"                    "marshall"              
##  [3] "gcs"                    "sex"                   
##  [5] "pao2_0_300"             "pbto2_0_20"            
##  [7] "icp1_20_inf"            "age.pbto2_0_20"        
##  [9] "marshall.pbto2_0_20"    "gcs.pbto2_0_20"        
## [11] "sex.pbto2_0_20"         "pao2_0_300.pbto2_0_20" 
## [13] "pbto2_0_20.icp1_20_inf" "gos"

Best Model:

             Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)   -1.2976      0.2104  -6.1674  0 ***
age           -0.66        0.2124  -3.1072  0 ***
gcs            0.3881      0.2038   1.9044  0.06 .
icp1_20_inf   -0.7207      0.3753  -1.9202  0.05 *

Comparative Modeling

Another way to look for interactions is to attempt to explain differences in predictive accuracy of a model including PbtO2 vs a model that does not include PbtO2.

For example, this is a linear model that attempts to explain the point-wise log likelihood ratio (in LOO CV) calculated between a model that includes all predictors except PbtO2 and one that is identical but also includes PbtO2:

## 
## Call:
## lm(formula = llr ~ ., data = .)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.49880 -0.04388  0.01129  0.06888  0.29892 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -0.0154208  0.0151151  -1.020    0.309
## age           0.0216344  0.0167330   1.293    0.198
## sex          -0.0009327  0.0157388  -0.059    0.953
## marshall      0.0042938  0.0169350   0.254    0.800
## gcs           0.0254917  0.0158662   1.607    0.110
## icp1_20_inf  -0.0021972  0.0162663  -0.135    0.893
## pao2_0_300    0.0022467  0.0156829   0.143    0.886
## pao2_875_inf  0.0123124  0.0162685   0.757    0.450
## 
## Residual standard error: 0.1982 on 164 degrees of freedom
## Multiple R-squared:  0.03309,    Adjusted R-squared:  -0.00818 
## F-statistic: 0.8018 on 7 and 164 DF,  p-value: 0.5869
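The response variable in the regression above, the point-wise log likelihood ratio, can be sketched in Python as follows (illustrative only). The inputs are each patient's held-out predicted probability from the two models. The sign convention (positive when the model with PbtO2 does better) is an assumption, as are the function names and the example numbers.

```python
import math


def log_lik(p, y):
    """Bernoulli log likelihood of outcome y (0 or 1) under predicted probability p."""
    return math.log(p) if y == 1 else math.log(1.0 - p)


def pointwise_llr(p_with, p_without, ys):
    """Per-patient log likelihood ratio: model with PbtO2 minus model without it."""
    return [
        log_lik(pw, y) - log_lik(po, y)
        for pw, po, y in zip(p_with, p_without, ys)
    ]


# Made-up held-out probabilities for two hypothetical patients.
llr = pointwise_llr(p_with=[0.8, 0.3], p_without=[0.6, 0.4], ys=[1, 0])
```

Regressing these per-patient values on the covariates, as in the output above, asks whether any covariate explains where adding PbtO2 helps or hurts the prediction.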

Only PaO2 is hinted at as being connected to the differences (none of the coefficients above is significant), and here is a plot of those differences:

Comparative Clusters

As one more way to verify that no other sub-groups benefit from PbtO2, a dimensionality reduction algorithm like t-SNE could be used to see whether inferred clusters are related to the predictive differences:

Sub Group Predictability Conclusions

There is some very weak evidence that an interaction exists between PaO2 and PbtO2, implying that PbtO2 is a stronger predictor of outcome when a patient experiences lower PaO2 levels.

There is also a good bit of evidence that no other covariate of interest (ICP, Age, GCS, or Marshall) has such an interaction, or any relationship indicating clusters of behavior with respect to PbtO2.